INPUT SUBSPACE DETECTION FOR DIMENSION REDUCTION IN 
HIGH DIMENSIONAL APPROXIMATION 

PAUL G. CONSTANTINE* AND QIQI WANQt 

Abstract. Many multivariate functions encountered in practice vary primarily along a few 
directions in the space of input parameters. When these directions correspond with coordinate 
directions, one may apply global sensitivity measures to determine the parameters with the greatest 
contribution to the function's variability. However, these methods perform poorly when the directions 
of variability are not aligned with the natural coordinates of the input space. We present a method 
for detecting the directions of variability of a function using evaluations of its derivative with respect 
to the input parameters. We demonstrate how to exploit these directions to construct a surrogate 
function that depends on fewer variables than the original function, thus reducing the dimension 
of the original problem. We apply this procedure to an exercise in uncertainty quantification using 
an elliptic PDE with a model for the coefficients that depends on 250 independent parameters. 
The dimension reduction procedure identifies a 5-dimensional subspace suitable for constructing 
surrogates. 

Key words, dimension reduction, high dimensional approximation, interpolation, surrogate 
models 

1. Introduction & Motivation. In modern science and engineering practice, 
computational simulation is routinely employed to help test hypotheses and explore 
new designs. As the speed and capability of computers increase, so does the com- 
plexity of simulations through greater resolution and higher fidelity physical models. 
Expensive simulations requiring extensive time on massive supercomputers are now 
commonplace. Due to the cost of these high-fidelity simulations, one often wishes 
to approximate the output at many points in the space of inputs using a surrogate 
function or a meta-model. The parameters of the surrogate are tuned with a budget- 
constrained number of costly high-fidelity runs, and the tuned surrogates are used to 
study sensitivities or uncertainties in the simulation output with respect to variation 
in the input parameters. 

However, many surrogate models suffer from the so-called curse of dimensionality. 
Loosely speaking, the work required to construct and evaluate an accurate surrogate 
increases exponentially as the dimension of the parameter space increases. For ex- 
ample, this curse limits the applicability of polynomial-based surrogates to problems 
with a handful of input parameters. Even methods whose application is independent 
of the dimension of the parameter space - such as radial basis functions or Gaussian 
process models - often perform poorly if the function is not sufficiently smooth and 
the training data are too sparse. 

Fortunately, in many problems of interest with high dimensional input spaces, 
the output often depends on only a few important parameters. Specifically, the vari- 
ability in the output can be attributed to a subset of the inputs. Surrogates can be 
adjusted to take advantage of this anisotropic parameter dependence. A common 
approach - known as global sensitivity analysis ^2 ~ involves a strategy for ranking 
the input variables and biasing the choice of design points to capture the function's 
behavior as the important parameters are varied. In some cases, the ranking pro- 
cedure can use a priori knowledge from the mathematical model. In other cases, it 
requires exploration of the output through sampling. In either case, methods based 
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on variance-based decompositions [10 or high-dimensional model representations [8] 
choose a few important parameters from the full set of inputs. For problems encoun- 
tered in practice, this often results in a dimension reduction of the input space; a 
surrogate can be constructed on a function of fewer variables with significantly less 
work. 

In this paper, we present a generalization of subset selection methods. Namely, we 
seek a low-dimensional linear subspace of the input parameter space that captures the 
majority of the output's variability. The subspace induces a reduced set of coordinates, 
and surrogate functions can be trained on the reduced coordinates to approximate the 
output in the full space. 

More precisely, for a function of interest / = /(s) with s G M^, we seek a function 
g = g{sa) with G such that f ^ g with a < d. The approximate function 
takes the form ^(s^) = /(As^), where A is a d x a matrix representing a linear 
map from to R^. Note that the introduction of g is primarily for notation; each 
evaluation of g is ultimately an evaluation of / at specially chosen input values. 
However, the dependence of g on fewer variables makes it more amenable to surrogate 
approximation. 

A similar idea is proposed in [9 in the context of model reduction for inverse 
problems, but the method for computing the basis vectors that define the subspace 
employs the residual of a system of equations representing a physical model. Our 
method applies to more general multivariate functions, and it is particularly efficient 
if one can easily compute gradients of outputs with respect to inputs. We discuss 
strategies for the case when only function evaluations are available, including an 
intriguing idea of using new matrix completion techniques on a partially sampled 
matrix of finite difference approximations of the gradient. 

2. Input subspace detection and dimension reduction. We assume that 
a given multivariate function /(s) with s G R^ varies primarily along a few directions 
in the input space. However, these directions may not be aligned with the natural 
coordinate system. The goal in this section is to construct a function g that approxi- 
mates / but takes only as many inputs as directions of variability. Our strategy is to 
first determine the directions along which / varies most prominently; we rotate our 
coordinate system according to these directions. We then define g to depend on the 
subset of these rotated coordinates that contain the majority of the variability in /. 

2.1. Directions of variability. Let 1] be a hyperrect angle defined by the vec- 
tors Si and 



We assume without loss of generality that G R^ is the center of mass of Q. Define a 
scalar function f : Q that takes d inputs. Denote an element of 1] by a d-vector 
s = (si, . . . , G ft. For the analysis, we assume that / is analytic in a region 
containing Q. Denote the <i- vector j = j(s) as 



n = {s : s G R"^, S| < s < s^} 



(2.1) 



J 



V/ 



9L 

dsi 



dsd 



(2.2) 



which is the Jacobian of /. Define the d x d matrix C 




(2.3) 
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where we employ a shorthand ds to denote a measure on Note that C is symmetric 
and positive semidefinite, which imphes it has an eigenvalue decomposition 

C = VAV^, A = diag (Ai, . . . , A^), Ai > • • • > A^ > 0. (2.4) 

If is the ith column of V, then 

,2 



A, = 



Cv, = v/ 



/ ifds)v, = [ {vjifv,)ds = [ (fv,)'ds. (2.5) 



We examine the Taylor expansion of / at the point s - 
/(s + /ivi)-/(s) = hj{sfvi 



hvi G fl around the point s, 
+ .... (2.6) 



Taking the root-mean-squared of ( |2.6[ ) and applying ( |2.5[ ), we get 

||/(s + /iv,)-/(s)|U, = 0{h^/)^), (2.7) 
where 11 • \\lo is the standard L2 norm for functions defined on Q. Note that (|2.7|) 



2; this 



implies the following: if A^ =0, then the function / is constant along the direction of 
v^. We can use this flatness to construct a sampling strategy to approximate / on a 
low dimensional manifold of Q. 

As an example, consider the function /(s) = cos(si 
function in plotted in figure |2.1[ The Jacobian of / is 

V/ = [-sin(si + 52) 

The matrix C is then given by 

sin^(5 



- sin(5i 



- 52) defined on 



• S2) sin {si 
■ S2) sin^(si 



^2) 
52) 



/ sm 

-ttJ-tt [sin2(5i 
The eigenvalue decomposition of C from ( |2.3| ) is 

^ ^ \V2/2 -V2/2 
[V2/2 V2/2 



dsi ds2 = 27r 



(2.8 



(2.9) 





"4^2 


0" 















V2/2 V2/2 
-V2/2 V2/2 



(2.10) 



Notice that the normalized vector [a/2/2, a/2/2] - the first eigenvector - precisely 
identifies the direction in the domain along which / varies. Therefore, if we study / 
along the line defined by [a/2/2, a/2/2] then we can understand the variation in / 
over the whole domain through a projection. 

2.1.1. A note on ridge-type functions. The previous function is an example 
of a ridge function, which appear frequently in statistics [4 . A ridge function takes 
the form 



/ = /(a^s) = m, a e 
The Jacobian has a special form in this case: 

df 



s eQ, t 



T 

a s. 



V/ 



dt 



Then 



vAv' 



A 



a J 



dt, 



1, 



(2.11) 



(2.12) 



(2.13) 



where the norm || • || is the standard 2-norm on R^. The eigenvector v is a normalized 
version of a that reveals the direction of variability for /. In this case, C has rank 
one when df/dt ^ 0, and the vector v can be computed with one normalized point 
evaluation of j(s). 
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Fig. 2.1: The function cos(si + S2) varies along the normahzed vector [a/2/2, \/2/2]^ . 



2.2. Dimension Reduction. Assume that the eigenvalue decomposition (2.4) 
of C can be partitioned as 



V = [V, V5] , A 



A. 







(2.14) 



where Va has a columns, V5 has h columns, and a ^ h = d. The columns of 
correspond to directions along which / varies, and the columns of V5 correspond to 
directions along which / is constant. We can construct a rotated coordinate system 
since 

/(s) = /(VV^s) = /(V,V^s + V5V,^s) = /(VaS, + V5S5) = ^(s„S5). (2.15) 

By construction, the value of the function g will not change as S5 varies. One may 
be tempted to fix S5 (say, set S5 = 0) and treat g as function of the a variables s^. 
However, there are two issues we must address. 

2.2.1. Rotated coordinates. First, what values can s^ take? We can linearly 
transform the set of points s G to get a range for s^. In particular, we define the 
set 



Vta = {Sa : = V^s, s G Vt} . 



(2.16) 



Since Vt is convex, Q^a is also convex, but this is about all we can say. The coordinates 
Sa cannot be varied independently within a set of independent intervals like a hyper- 
rectangle, since the transformed domain Via will most likely not be a lower dimensional 
hypercube; imagine taking a photograph of a rotated cube. For this reason, when we 
construct a surrogate on the reduced coordinates s^, it must be fiexible enough to 
handle general convex domains in multiple dimensions; radial basis functions could 



be an appropriate choice. We discuss sampling from the space Via in section 3.4 



2.2.2. The domain of /. We must ensure that all function evaluations of / 
occur at points in the domain Vt] each evaluation of g is ultimately an evaluation of 
/ at specially chosen input values. It is possible that the projection \ a^a = V^V^s 
will not be in 1], and we do not want to assume anything about / outside its domain. 
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(a) cos(0.3si + 0.7s2) (b) Domains 

Fig. 2.2: The two-dimensional domain [— 7r,7r]^ of cos(0.3si + 0.7s2) shown in blue. 
The red shows the projection of the domain onto the direction of variability. The 
green circles show the points where we evaluate g. 



Fortunately, we can take advantage of the flatness of / to ensure that all evaluations 
occur within Q.. In short, for any point \a^a that falls outside the domain of /, we 
can walk back along the directions in which / is constant until we reach a point in 
the domain. 

More precisely, if VaS^ G 1^, then we evaluate / at V^Sa. If VaSa Vt^ then we 
find z G such that V^Sa + V^z G Q.. Now we define g as 

n(^ \ - \ /(VaSa) if V^S^ G . . 

Note that z is often not uniquely determined, but we only need one for each deviant s^. 
If 1] is a hyperrect angle, then a z can be found by solving a suitable linear program; 
see section [3^ We have thus acheived our goal of constructing a function g dependent 
on a < d parameters that behaves like the d-variate function /. 

We demonstrate the rotation and reduction on a slight modification of the previ- 
ous example. Let /(s) = cos(0.3si + 0.7s2) be defined on [— 7r,7r]^ with gradient 

V/ = [-0.3sin(0.35i + 0.7s2) -0.7sin(0.35i + 0.7^2)] . (2.18) 



Figure 2.2 shows the domain [— 7r,7r]^ in blue. The projection of the domain onto 
the line corresponding to the direction of variability of / is shown in red. The red 
circles correspond to a possible sampling of the reduced (one-dimensional) coordinates. 
Notice that some of the projected points fall outside [— tt, tt]^ when transformed back to 
the original two-dimensional space. The green circles show the points in [— 7r,7r]^ that 
are substituted for the red points outside the domain when evaluating the function g. 

3. Computational aspects. We next consider four computational aspects for 
the dimension reduction procedure. We close this section with a practical algorithm 
that summarizes the presentation. 



3.1. Low variability versus no variability. In (2.14), we assume that some 



of the eigenvalues are exactly zero. When this happens, the estimate (2.7) tells us 



6 



P. G. CONSTANTINE AND Q. WANG 



that / is exactly constant along some directions. But what happens when the eigen- 



values are small but not zero? The estimate (2.7) addresses an averaged measure of 
variability along a direction. Like any averaged measure, it does not preclude sharp, 
local variability. It is possible that one could choose to ignore a direction because its 
associated eigenvalue is below a specified tolerance but subsequently discover a sharp 
local feature in / along this direction. 

However, practical considerations like computational budget often dominate the 
concerns when approximating functions in high dimensions. Any well-motivated strat- 
egy to reduce cost is welcome. In this spirit, we treat the magnitudes of the eigenvalues 
as a ranking on the rotated coordinates. If we desire an approximate bivariate function 
^ of /, then we choose the directions associated with the two largest eigenvalues. 

3.2. Approximating the eigenvalues and eigenvectors. For high dimen- 
sional functions found in practice, we expect that there will be a few dominant di- 
rections in the sense described above. We usually are not able to compute the exact 



matrix C from (2.3), but we can approximate it. Assume for now that we can evalu- 
ate the exact Jacobian j(s)-^ given s. Then we can approximate C with a numerical 
quadrature rule. For simplicity, we use a Monte Carlo approximation. For z = 1, . . . , /c, 
let s^ be samples drawn from 1], and compute the d x k matrix 

J=[j(si) ••• j(sfc)]. (3.1) 

Then 

^ mJ^ rr \n 



J2i{s,)j(sif = ^JJ^ (3.2) 



k 

where \Q\ is the volume of Q. The quality of the approximation can be controlled by 
the number of samples k. For the Monte Carlo approximation, the variance of the 
approximation decreases like [7 . 

Results from eigenvalue perturbation theory show that the error in the approxi- 
mate eigenvalues is on the order of the error in the matrix elements [6] . More accurate 
numerical quadrature methods will result in more accurate approximate eigenvalues, 
but many high order (e.g., interpolatory) multivariate quadrature rules suffer from 
the same curse of dimensionality that we wish to avoid. For this reason, we rely on 
Monte Carlo methods. 

If / is constant along some directions, then these directions will be in the null 
space of C when k > i.e., when the number of Jacobian samples is greater than 
the number of parameters of /. The danger with the approximation C is potentially 
overpredicting the dimension of the null space or, equivalently, underpredicting the 
rank of C. In other words, the Jacobian evaluations at the design sites s^ may indicate 
that / is fiat along directions that it actually varies in Q. 

3.3. Approximate Jacobians. Up to this point, we have assumed that the 
Jacobian j(s)-^ was available for computation. This is not true in many cases, par- 
ticularly if / represents the output of a complex physical simulation. We therefore 
address the question of approximating the Jacobian from point evaluations of /. 

If / can be evaluated at will, then a finite difference approximation along the 
original coordinate directions takes d -\- 1 evaluations - one at s^ and one for each 



perturbation. Thus, approximating J from (3.1) takes k{d -\- 1) function evaluations. 



The potential benefits of revealing the directions of variability may justify this cost. 
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particularly if one is faced with a number of evaluations of / that is exponential in d 
to construct an accurate surrogate. 

If evaluations of / are very expensive, then we want to obtain the eigenvectors of 
C with as few as possible. We can potentially use fewer than k{d-\- 1) evaluations by 
employing recently developed methods for matrix completion [3 under the assumption 
that J is row rank deficient, which is equivalent to a rank deficient C; see (3.2). If 
rank ( J) = a - corresponding to a function / with a directions of variability - then 
we can recover J to within the precision of the finite difference approximation by 
computing a constant times a{k + d) entries of J. 

Let P be a subset of the pairs of indices (z, j) with i = 1, . . . , (i and j = 1, . . . , /c, 
where i indexes the coordinates Si and j indexes the design points Sj from ( |3.1[ ) . For 
didxk matrix X, define Pr'(X) to return a vector of the entries of X corresponding to 
the index pairs in V. For a given tolerance e, the singular value thresholding (SVT) 
algorithm [2 seeks a solution to the convex optimization problem 



mmimize 

a 

subject to 



l|x|U 

||Px)(X) - Pp(J) 



<e, 



(3.3) 



where || • ||* is the nuclear norm. Note that the finite difference parameter for the 
approximate Jacobian provides a natural tolerance on the constraints of the convex 
optimization problem. 

In fact, the SVT algorithm returns approximate singular vectors/ values for J, 



which saves the trouble of forming C with J; see (3.2). The left singular vectors of 



J approximate the eigenvectors V, and the singular values approximate the square 



roots of the eigenvalues; see (2.4). We can use the output of the SVT method directly 
to obtain the directions of variability. We demonstrate this approach in the numerical 
examples in section |4j 

We will not comment on the cost of the SVT algorithm; we mention it for the case 
when computing more entries of J through evaluations of / is more expensive than 
running the SVT algorithm. For example, if / is evaluated with an expensive PDE 
simulation, and the dimensions of J are in the tens to thousands, then this approach 
is appropriate. 



3.4. Sampling from To sample from the reduced space defined in (2.16), 
we use a simple acceptance/rejection scheme. We first determine an a-dimensional 
hyperrectangle that contains Qa by solving a independent linear programs. 



mmimize s, 

s 

subject to < s < 



(3.4) 



where is the ith column of V^. Let s* be the minimizer of (3.4). Then we define 
the hyperrectangle Cla as 

















^1 ^1 












< 




< 




> 








_ a a_ 








-v^s* 


> 



(3.5) 



Notice that fta C l^a, and we expect that the volume of the enclosing hyperrectangle 
will be much larger than the volume of Qa in high dimensions. 

To draw a sample from we draw t uniformly from Qa- If V^t G ^1, then we 
set Sa = t. If Vat but there exists a z such that V^t + V^z G ft^ then we also 
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set So = t. To determine if such a point exists, we can attempt to solve the hnear 
program, 

minimize 

s 

subject to Sa = V^s (3.6) 

Si <S< Su. 

If a point s* is found that satisfies the constraints, then z = V^s*. If such a z does 



not exist, then we reject t. Notice that the objective function in (3.6) is essentiahy 
meaningless; it is merely used to set the problem in terms easily entered into a linear 
program solver. 



Each sample from Qa is used to evaluate ^(s^) as in (2.17), which we use to 
construct a surrogate on the low dimensional subspace. 

3.5. A practical algorithm. We have now discussed all the pieces in the pro- 
cedure for approximating / on the low dimensional manifold. 

1. Compute the directions. If one can evaluate j(s), choose points Si G Q 
with z = 1, . . . , /c and compute 

J= [j(si) •■■ j(sfe)], HjjT ^ c = VAV^. (3.7) 



If one can only evaluate /(s), use the procedure from section 3.3 to approxi- 
mate the eigendecomposition of C. 

Determine the directions of variability. Examine the eigenvalues Ai, . . . , A^^ 
and choose a truncation a < d according to their magnitude. (This judgment 
can be difficult to make algorithmically.) Set to be the first a eigenvectors. 
Evaluate g at points in the reduced domain. Use the acceptance/rejection 



method from section 3.4 to choose a set of design points yj G fta for j = 
l,...,n. For each design point, compute gj = g{yj) using ( 2.17[ ). Note 



that each evaluation may require the computation of Zj using the approach 

described in section [331 
4. Approximate / at a point in Q. For a point s G 1^, compute s^ = V^s. 

Approximate ^(s^) using an interpolation procedure on the points {yj} and 

evaluations {gj}. This approximation occurs on the space Qa of reduced 

dimension. Set /(s) to be the approximation of ^(s^). 
Once the first three steps have been completed, the last step can be repeated as 
needed. For example, numerical integration or optimization can be performed on 
/(s) using the surrogate constructed on the reduced space Qa- 

4. Numerical Examples. In this numerical exercise, we perform an uncertainty 
study on an elliptic PDF with a random field model for the coefficients. Such problems 
are common test cases for methods in uncertainty quantiffcation [l] |5] . 

4.1. PDE model, input parameters, and quantity of interest. Consider 
the following linear elliptic PDE. Let = ii(x, s) satisfy 

- V • (aVu) = 1 (4.1) 

on the spatial domain x G [0, 1]^. We set homogeneous Dirichlet boundary conditions 
on the left, top, and bottom of the domain; denote this boundary by Fi. The right 
side of the domain - denoted F2 - has a homogeneous Neuman boundary condition. 
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The log of the coefficients a = a(x, s) of the differential operator are given by a 
truncated Karhunen-Loeve type expansion 

d 

log(a) = ^ (l)iy/o^iSi, (4.2) 

where the Si are independent, identically distributed uniform random variables on 
[—2,2], and the are the eigenpairs of the covariance operator 

with pi = 1 and p2 = 0.05. The small p2 models a short correlation length in the 
vertical coordinate. The decay of the justifies a truncation of d = 250, so that the 
parameter space Q for the problem is the 250-dimensional hypercube [—2,2]^. Three 
realizations of the log of the coefficients a and their corresponding solutions u are 
shown in figures [TT] and [4^ respectively 

Define the linear function Q = Q{s) of the solution 



1 



udyi. (4.4) 



The quantity of interest for the uncertainty study is an approximate density function 
for Q. 

4.2. Finite element discretization. Given a value for the input parameters s, 
we discretize the elliptic problem with a standard linear finite element method using 
Matlab's PDE Toolbox. The discretized domain has 34320 triangles and 17361 
nodes; the eigenfunctions = ^i(x) from ( |4.2[ ) are approximated on this mesh. The 
matrix equation for the discrete solution u = u(s) at the mesh nodes is 

Ku = f , (4.5) 

where K = K(s) is symmetric and positive definite for all s G 1^. We can approximate 
the linear functional Q as 

Q ^ c^u = Q, (4.6) 

where the elements of c are zero except corresponding to nodes on The nonzero 
elements are constant and scaled so that they sum to one; note that c does not depend 
on s. 

4.3. Adjoint variables for derivatives. Since the quantity of interest can be 
written as a linear functional of the solution, we can define adjoint variables that we 
will help us compute the Jacobian of Q with respect to the input parameters s. Notice 
that we can write 

Q = c^u = c^u-y^(Ku-f), (4.7) 



for any constant vector y. Taking the derivative of (4.7) with respect to the input Si 
we get 

dQ T f T f , 



dsi ^ \ dsi / ^ V dsi ds 
rp . f du\ J. f dK. 



\dsi J yds 



u 
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Fig. 4.1: Three realizations of the coefficients log(a(x, s)). 
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Fig. 4.2: Three reahzations of the solution ii(x, s). 



(c) 
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If we choose y to solve the adjoint equation 



K^y = c, 



then 



dQ_ 

dsi 



dsi 



Three realizations of the adjoint variables y = y(s) are shown in figure 4.3 

To approximate the Jacobian VQ at the point s, we compute the finite element 



(4.; 



(4.9) 



solution with ( |4.5| ), solve the adjoint problem ( |4.8[ ), and compute the components 
with 



( |4.9[ ). The derivative of K with respect to Si is easy compute from the derivative 
of a(x, s) and the same finite element discretization. 

4.4. Approximating the subspace. To apply the input reduction with the 
subspace detection technique, we first sample the Jacobian at random points in Vt to 
construct J from (3.1). From a reference computation of 10^ samples, we examine 



the singular values to determine an appropriate truncation. The singular values of 
J are plotted in Figure |4.4[ the decay justfies a truncation after five terms. For 
reference, we also plot the singular values from the Karhunen-Loeve expansion 
(4.2). The more rapid decay of the singular values of J shows that the particular 
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Fig. 4.3: Three realizations of the adjoint variables y = y(s) on the mesh. 
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Fig. 4.4: The singular values of a reference computation of J with 10^ samples along- 
side the singular values of the Karhunen-Loeve type expansion of the coefficients of 



the differential operator from (4.2). 



output quantity of interest depends primarily on fewer variables than the correlated 
random field modeling the coefficients of the differential operator. 

In the remainder of the numerical exercise, we split the reference 10^ samples 
into five groups of 2000 samples. This is to mimic an initial computational budget of 
2000 samples, which we repeat five times to mildly alleviate affects associated with 
a particularly good or bad choice of 2000 samples. Figures will display the results of 
each of the five independent experiments. 

To check convergence of the projection onto the reduced subspace as more Jaco- 
bian samples are added, we compute the left singular vectors of J for m = 100, 200, 300, . 
The difference between the subspaces defined by subsequent sets of samples rrii and 



, 2000. 
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(a) Reference error 
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2000 



Fig. 4.5: The convergence of subspaces defined by taking the first a = 5 left singular 
vectors from five different random samplings of J. The index i corresponds to adding 



more samples to J. Figure 4.5a shows the error between the subspace from a reference 



solution of 10^ samples and the increasing the number of samples in J; see (4.11). 



Fig ure |4.5b| shows the relative change in the subspaces as more samples are added; 
see (|4.10|). 



m^+i is given by 



= \\ya,rn.^,ylrn.^,-ya,rn.ylrn.l (4.10) 



where V^^^. are the first a left singular vecto rs of J app roximate with samples, 
and the norm is the matrix 2-norm. In figures 4.5a|4.5b , we plot S^^^ along with the 



error of the projected subspace compared to the reference solution, 

^ = ||V,,«.Vj_^, - Va,reiVlre{l (4.11) 

where V^^ref are the first a eigenvectors from the reference computation of J. 

Each column of J requires a forward solve, an adjoint solve, and 250 computations 
for the gradient VQ. We can try to reduce the number of derivative computations 



with the SVT algorithm for matrix completion, as described in section 3.3 Using 
the left singular vectors of J computed with m = 2000 samples, we can test the SVT 
method by uniformly subsampling the entries of J. The number of subsampled entries 
is controlled by 7 with < 7 < 1, which is the proportion of entries revealed in the 
incomplete matrix. In figure |4.6[ we plot the difference between the subspace from 



the subsampled J and the subspace from the full J for 7 = 0.1, 0.2, . . . , 0.9, 

^7 = l|Va,^Vf,^-V„Vf||. (4.12) 

For the SVT algorithm, we used the Matlab implementation from [2 with the fol- 
lowing parameters: objective parameter tau=100, stopping criterion tol=le-4, noise 
constraint EPS=le-6, step size delta=l, and maximum iterations inaxiter=1000. 

4.5. Building the low dimensional surrogate. We first map the 2000 design 
sites Si G to get an initial set of design sites = V^s^ G l^a; we have the evaluations 
of Q associated with these points from the initial sampling of the Jacobian. Following 
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Fig. 4.6: The decrease in the error of the subspace defined by the first a = 5 left 
singular vectors as more entries of J with m = 2000 samples are revealed according 



the proportion parameter 7; see (4.12). 



the process outlined in section |3.4[ we sample uniformly from the space Qa to find 
5000 additional design sites in the lower dimension subspace. The acceptance rate 
of the acceptance/rejection scheme is roughly 35% (averaged over the five identical 
experiments), which validates the intuition about the small volume of Qa relative to 
its enclosing hyperrect angle. It took an average of 19201 linear programs to get the 
5000 samples. 



For each sample, we evaluate Q from (4.6). This gives us a total of 7000 points 
in the lower dimensional subspace - 2000 from the original Jacobian evaluations and 
5000 from sampling on the reduced subspace - on which to construct a surrogate. 
We use the kriging toolbox DACE [11^ to build a surrogate on the lower dimensional 
space Qa- 

Since the evaluation of Q is relatively inexpensive in this example, we compare 
the surrogate's prediction of Q with the actual Q on 10^ points chosen uniformly at 
random from Q. The histograms of the log of the surrogate error are shown in figure 



4.7 - one for each of the five experiments. 



4.6. Approximating the desnity function. To approximate the density func- 
tion of Q, we draw samples from the full space ft, use the low dimensional surrogate 
to approximate the output quantity of interest, and build a histogram of the sam- 
ples. Specifically, for a point s G 1^, we compute = V^s. Then we use the low 
dimensional surrogate to approximate Q at s^. Using 10^ such evaluations, we obtain 
reasonably well- converged histograms. In figure 4.8, we plot the histogram of 10^ 
evaluations of Q from the full model alongside the histograms from each surrogate 
experiment. We see that surrogate approximates the bulk of the histogram reasonably 
but loses accuracy near the tails. This is expected; the low dimensional subspace is 
detected by averaged variability. Therefore, we do not expect to capture extremes of 
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Fig. 4.7: Histograms of the loglO of the error between the surrogate approximation 
of Q and the true value of Q at 10^ randomly sampled points from the full space 
Each histogram corresponds to a different random sampling used to construct the 
surrogate. 



Q, and this is reflected in a loss of accuracy in the tails. 

5. Conclusion. We have presented a method for detecting the primary direc- 
tions of variability of a function of many variables. We have described how to exploit 
these directions to construct a surrogate on a low dimensional subspace of the high 
dimensional input space. We demonstrated this procedure on an uncertainty quantifi- 
cation study with a model problem of an elliptic PDE with variable coefficients that 
depend on 250 independent input parameters. 
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